Max Margin AND / OR Graph Learning for Efficient Articulated Object
Authors
Long (Leo) Zhu, Department of Statistics, University of California at Los Angeles, Los Angeles, CA 90095
Yuanhao Chen, University of Science and Technology of China, Hefei, Anhui 230026, P.R. China
Chenxi Lin, Microsoft Research Asia
Alan Yuille, Department of Statistics, Psychology and Computer Science, University of California at Los Angeles, Los Angeles, CA 90095
Abstract
In this paper we formulate a novel AND/OR graph representation for parsing articulated objects into parts and recovering their poses. The AND/OR graph allows us to handle an enormous variety of articulated poses with a compact graphical model. We develop a novel inference algorithm, compositional inference, that uses a bottom-up compositional process for proposing configurations for the object. The strategy of surround suppression is applied to ensure that the inference time is polynomial in the size of the input data. We present a novel structure-learning method, Max Margin AND/OR Graph (MM-AOG), to learn the parameters of the AND/OR graph model discriminatively. Max-margin learning is a generalization of the training algorithm for support vector machines (SVMs). The parameters are optimized globally, i.e. the weights of the appearance model for individual nodes and the relative importance of spatial relationships between nodes are learnt simultaneously. The kernel trick can be used to handle high-dimensional features and to enable complex similarity measures to discriminate between object configurations. We applied our approach (the AND/OR graph representation, compositional inference and max-margin learning) to the tasks of detecting, segmenting and parsing horses and human bodies. We demonstrate that the inference algorithm is fast and analyze its computational complexity empirically. To evaluate max-margin learning, we perform comparison experiments on the horse and human baseball datasets, showing significant improvements over state-of-the-art methods on benchmarked datasets.
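As a rough illustration of why an AND/OR graph can cover many poses with a compact model, the sketch below scores a toy graph recursively: AND nodes sum the scores of their parts plus a spatial-relation term, and OR nodes pick the best alternative. This is only an assumption-laden toy, not the paper's compositional inference (there is no surround suppression, no image data, and no learned weights); the `Node` class, the `best_score` function and all numeric scores are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # kind is "leaf", "and", or "or" (hypothetical encoding for this sketch).
    kind: str
    # For a leaf: an appearance score. For an AND node: a spatial-relation score.
    # OR nodes ignore this field.
    score: float = 0.0
    children: List["Node"] = field(default_factory=list)


def best_score(node: Node) -> float:
    """Return the score of the best configuration rooted at this node."""
    if node.kind == "leaf":
        return node.score
    child_scores = [best_score(child) for child in node.children]
    if node.kind == "and":
        # An AND node composes all of its parts.
        return node.score + sum(child_scores)
    if node.kind == "or":
        # An OR node selects exactly one alternative.
        return max(child_scores)
    raise ValueError(f"unknown node kind: {node.kind!r}")


# Toy example: a "leg" that is either straight or bent, attached to a torso.
straight = Node("and", score=0.5, children=[Node("leaf", 1.0), Node("leaf", 2.0)])
bent = Node("and", score=0.25, children=[Node("leaf", 3.0), Node("leaf", 1.5)])
leg = Node("or", children=[straight, bent])
torso = Node("leaf", 2.0)
body = Node("and", score=1.0, children=[torso, leg])

print(best_score(body))
```

Each additional OR node multiplies the number of representable configurations while adding only a few nodes to the graph, which is the compactness the abstract refers to.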
Similar resources
Max-Margin Learning of Hierarchical Configural Deformable Templates (HCDT) for Efficient Object Parsing and Pose Estimation
In this paper we formulate a hierarchical configurable deformable template (HCDT) to model articulated visual objects – such as horses and baseball players – for tasks such as parsing, segmentation, and pose estimation. HCDTs represent an object by an AND/OR graph where the OR nodes act as switches which enable the graph topology to vary adaptively. This hierarchical representation is compositi...
Exponential Family Graph Matching and Ranking
We present a method for learning max-weight matching predictors in bipartite graphs. The method consists of performing maximum a posteriori estimation in exponential families with sufficient statistics that encode permutations and data features. Although inference is in general hard, we show that for one very relevant application–document ranking–exact inference is efficient. For gener...
Probabilistic models of vision and max-margin methods
It is attractive to formulate problems in computer vision and related fields in terms of probabilistic estimation where the probability models are defined over graphs, such as grammars. The graphical structures, and the state variables defined over them, give a rich knowledge representation which can describe the complex structures of objects and images. The probability distributions defined ove...
Semi-Supervised Learning with Max-Margin Graph Cuts
This paper proposes a novel algorithm for semi-supervised learning. This algorithm learns graph cuts that maximize the margin with respect to the labels induced by the harmonic function solution. We motivate the approach, compare it to existing work, and prove a bound on its generalization error. The quality of our solutions is evaluated on a synthetic problem and three UCI ML repository dataset...
Max-Margin Deep Generative Models for (Semi-)Supervised Learning
Deep generative models (DGMs) are effective on learning multilayered representations of complex data and performing inference of input data by exploring the generative ability. However, it is relatively insufficient to empower the discriminative ability of DGMs on making accurate predictions. This paper presents max-margin deep generative models (mmDGMs) and a class-conditional variant (mmDCGMs...